A Linear-RBF Multikernel SVM to Classify Big Text Corpora

Authors

  • R. Romero
  • E. L. Iglesias
  • L. Borrajo
Abstract

Support vector machines (SVMs) are a powerful technique for classification. However, SVMs are poorly suited to classifying large datasets or text corpora, because their training complexity depends heavily on the input size. Recent developments in the literature on SVMs and other kernel methods emphasize the need to consider multiple kernels or kernel parameterizations, because they provide greater flexibility. This paper presents a multikernel SVM for high-dimensional data that provides automatic parameterization at low computational cost and improves on the results of SVMs parameterized by brute-force search. The model partitions the dataset into cohesive term slices (clusters) to construct a defined structure (the multikernel). The new approach is tested on different text corpora. Experimental results show that the new classifier achieves good accuracy compared with the classic SVM, while training is significantly faster than for several other SVM classifiers.
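As a rough illustration of the idea of a multikernel built over term slices, the following sketch sums a linear kernel on one slice of the term vector and an RBF kernel on another. It is only a minimal sketch under stated assumptions: the abstract does not say how slices are formed, which kernel is assigned to which slice, or how kernels are combined, so the slice layout, the `gamma` value, the unweighted sum, and all function names here are hypothetical.

```python
import math

def linear_kernel(u, v):
    # Dot product of two equally sized vectors.
    return sum(a * b for a, b in zip(u, v))

def rbf_kernel(u, v, gamma):
    # exp(-gamma * squared Euclidean distance).
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)

def multikernel(x, y, slices):
    # slices: list of (term_indices, kind, gamma); this layout is an
    # assumption for illustration, not the structure used in the paper.
    total = 0.0
    for indices, kind, gamma in slices:
        xs = [x[i] for i in indices]
        ys = [y[i] for i in indices]
        if kind == "linear":
            total += linear_kernel(xs, ys)
        else:
            total += rbf_kernel(xs, ys, gamma)
    return total

def gram_matrix(docs, slices):
    # Pairwise kernel values between all documents.
    return [[multikernel(a, b, slices) for b in docs] for a in docs]

# Toy term-frequency vectors over four terms (made-up data).
docs = [
    [1.0, 0.0, 2.0, 0.0],
    [0.0, 1.0, 0.0, 3.0],
    [1.0, 1.0, 2.0, 1.0],
]
# Hypothetical slicing: terms 0-1 use a linear kernel, terms 2-3 an RBF kernel.
slices = [([0, 1], "linear", None), ([2, 3], "rbf", 0.5)]
K = gram_matrix(docs, slices)
```

Because a sum of valid kernels is itself a valid kernel, a matrix such as `K` could be handed to any SVM solver that accepts a precomputed Gram matrix, for example scikit-learn's `SVC(kernel="precomputed")`.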

Similar resources

Distributed Artificial Intelligence Models for Knowledge Discovery in Bioinformatics

The increasing volume of existing information on biological processes and the use of large databases have significantly increased the accessibility of datasets to the scientific community. This has enabled performing an analysis to facilitate the extraction of relevant information or modeling and optimizing tasks in different processes. Parallel to the increasing volumes of information is the e...

MODELING OF FLOW NUMBER OF ASPHALT MIXTURES USING A MULTI–KERNEL BASED SUPPORT VECTOR MACHINE APPROACH

Flow number of asphalt–aggregate mixtures as an explanatory factor has been proposed in order to assess the rutting potential of asphalt mixtures. This study proposes a multiple–kernel based support vector machine (MK–SVM) approach for modeling of flow number of asphalt mixtures. The MK–SVM approach consists of weighted least squares–support vector machine (WLS–SVM) integrating two kernel funct...

Acoustic detection of apple mealiness based on support vector machine

Mealiness degrades the quality of apples and plays an important role in fruit market. Therefore, the use of reliable and rapid sensing techniques for nondestructive measurement and sorting of fruits is necessary. In this study, the potential of acoustic signals of rolling apples on an inclined plate as a new technique for nondestructive detection of Red Delicious apple mealiness was investigate...

Automated identification of biomedical article type using support Vector machines

Authors of short papers such as letters or editorials often express complementary opinions, and sometimes contradictory ones, on related work in previously published articles. The MEDLINE® citations for such short papers are required to list bibliographic data on these “commented on” articles in a “CON” field. The challenge is to automatically identify the CON articles referred to by the author...

Comparative Study of SVM Methods Combined with Voxel Selection for Object Category Classification on fMRI Data

BACKGROUND Support vector machine (SVM) has been widely used as accurate and reliable method to decipher brain patterns from functional MRI (fMRI) data. Previous studies have not found a clear benefit for non-linear (polynomial kernel) SVM versus linear one. Here, a more effective non-linear SVM using radial basis function (RBF) kernel is compared with linear SVM. Different from traditional stu...

Journal title:

Volume 2015, Issue

Pages -

Publication date: 2015